A Comparison of Image Processing Techniques for Visual Speech Recognition Applications
We examine eight different techniques for developing visual representations in machine vision tasks. In particular we compare different versions of principal component and independent component analysis in combination with stepwise regression methods for variable selection. We found that local methods, based on the statistics of image patches, consistently outperformed global methods based on the statistics of entire images. This result is consistent with previous work on emotion and facial expression recognition. In addition, the use of a stepwise regression technique for selecting variables and regions of interest substantially boosted performance.
Gray, Michael S., Sejnowski, Terrence J., Movellan, Javier R.
These methods are compared on their performance on a visual speech recognition task. While the representations developed are specific to visual speech recognition, the methods themselves are general-purpose and applicable to other tasks. Our focus is on low-level data-driven methods based on the statistical properties of relatively untouched images, as opposed to approaches that work with contours or highly processed versions of the image. Padgett [8] and Bartlett [1] systematically studied statistical methods for developing representations on expression recognition tasks. They found that local wavelet-like representations consistently outperformed global representations, like eigenfaces. In this paper we also compare local versus global representations.
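The local versus global distinction can be illustrated with a minimal sketch. It assumes synthetic random images, non-overlapping 8 x 8 patches, and 10 components, none of which come from the paper; the helper names `pca_basis` and `extract_patches` are hypothetical, and the ICA and stepwise regression stages are not reproduced here.

```python
import numpy as np

def pca_basis(X, k):
    """Return the top-k principal axes (as rows) of the samples in the rows of X."""
    Xc = X - X.mean(axis=0)  # center each variable
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

def extract_patches(images, p):
    """Cut non-overlapping p x p patches; returns an array of shape (n_patches, p*p)."""
    n, h, w = images.shape
    cropped = images[:, :h - h % p, :w - w % p]  # drop any ragged border
    return (cropped.reshape(n, h // p, p, w // p, p)
                   .transpose(0, 1, 3, 2, 4)
                   .reshape(-1, p * p))

# Synthetic stand-in for a set of 50 grayscale images of size 24 x 32 (assumption).
rng = np.random.default_rng(0)
images = rng.standard_normal((50, 24, 32))

# Global representation: PCA on whole images (eigenface-style basis).
global_basis = pca_basis(images.reshape(len(images), -1), k=10)

# Local representation: PCA on the statistics of image patches.
patches = extract_patches(images, p=8)
local_basis = pca_basis(patches, k=10)

# Global features: one coefficient vector per image.
flat = images.reshape(len(images), -1)
global_features = (flat - flat.mean(axis=0)) @ global_basis.T

# Local features: every patch projected onto the local basis, concatenated per image.
patch_coeffs = (patches - patches.mean(axis=0)) @ local_basis.T
local_features = patch_coeffs.reshape(len(images), -1)
```

In this sketch each global basis vector spans the entire image, while each local basis vector spans a single patch and is applied at every patch location, so the local feature vector retains information about where in the image a pattern occurred.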